Proclaim return types and other fixes needed for CCCL 3.2 #5375
gforsyth merged 5 commits into rapidsai:main
- std::vector<size_t> h_distance_offsets(global_max_distance + 1);
+ // Need global_max_distance + 2 elements: one for each distance level (0 to global_max_distance)
+ // plus a sentinel at the end for CUB segmented sort end offsets
+ std::vector<size_t> h_distance_offsets(global_max_distance + 2);
This appears to be an off-by-one error that results in an illegal memory access in SortPairs.
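To illustrate why the extra element matters, here is a minimal host-only sketch. It assumes the offsets are built by counting vertices per distance level and taking a prefix sum; the helper name `build_distance_offsets` is hypothetical and the PR does not show how cuGraph actually fills `h_distance_offsets`. With distance levels 0 to `max_distance` there are `max_distance + 1` segments, and a segmented sort that reads begin offsets from `offsets[k]` and end offsets from `offsets[k + 1]` touches `max_distance + 2` entries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not taken from the PR): builds segment offsets from the
// distance level of each vertex. Levels run from 0 to max_distance inclusive,
// so there are max_distance + 1 segments and max_distance + 2 offsets.
std::vector<std::size_t> build_distance_offsets(
  const std::vector<std::size_t>& distances, std::size_t max_distance)
{
  std::vector<std::size_t> offsets(max_distance + 2, 0);

  // Count the vertices at each level, shifted by one slot so that the
  // prefix sum below yields start offsets.
  for (std::size_t d : distances) {
    assert(d <= max_distance);
    ++offsets[d + 1];
  }

  // Running sum: offsets[k] becomes the start of segment k, and the final
  // entry (the sentinel) becomes the total number of vertices.
  for (std::size_t i = 1; i < offsets.size(); ++i) {
    offsets[i] += offsets[i - 1];
  }
  return offsets;
}
```

Segment `k` then spans `[offsets[k], offsets[k + 1])`. With only `max_distance + 1` entries, the end offset of the last segment would be read one element past the end of the array.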
  // Process each chunk - sort consecutive arrays directly in-place
+ // Allocate output buffers for CUB sort (input/output cannot overlap)
+ rmm::device_uvector<vertex_t> sorted_vertices(total_vertices, handle.get_stream());
+ rmm::device_uvector<origin_t> sorted_sources(total_vertices, handle.get_stream());
Tests were failing, so I investigated this file. I think the failures come from a violation of the CUB API preconditions: the inputs and outputs of the cub::DeviceSegmentedSort::SortPairs call overlapped, which the API does not allow. See the docs for cub::DeviceSegmentedSort::SortPairs.
As a workaround, I am initializing an output buffer for the sorted results, and then moving it back to the original variable once computation is complete.
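The workaround pattern can be sketched on the host as follows. This is a CPU stand-in under stated assumptions, not the cuGraph code: `segmented_sort_pairs` and `sort_segments_via_scratch` are hypothetical names, `std::vector` replaces `rmm::device_uvector`, and a per-segment `std::stable_sort` replaces the CUB call. It only shows the shape of the fix: sort into separate output buffers, then move them back into the original variables.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// CPU stand-in for an out-of-place segmented pair sort. Like the CUB call it
// imitates, it requires that the output buffers do not alias the inputs.
void segmented_sort_pairs(const std::vector<int>& keys_in,
                          std::vector<int>& keys_out,
                          const std::vector<int>& values_in,
                          std::vector<int>& values_out,
                          const std::vector<std::size_t>& offsets)
{
  assert(&keys_in != &keys_out && &values_in != &values_out);
  assert(keys_in.size() == values_in.size());
  assert(keys_out.size() == keys_in.size());
  assert(values_out.size() == values_in.size());

  // Segment s covers the index range [offsets[s], offsets[s + 1]).
  for (std::size_t s = 0; s + 1 < offsets.size(); ++s) {
    // Sort the indices of this segment by key, then gather keys and values.
    std::vector<std::size_t> order(offsets[s + 1] - offsets[s]);
    std::iota(order.begin(), order.end(), offsets[s]);
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) { return keys_in[a] < keys_in[b]; });
    for (std::size_t i = 0; i < order.size(); ++i) {
      keys_out[offsets[s] + i]   = keys_in[order[i]];
      values_out[offsets[s] + i] = values_in[order[i]];
    }
  }
}

// Sort into freshly allocated buffers, then move the results back into the
// caller's variables so the rest of the code can keep using the same names.
void sort_segments_via_scratch(std::vector<int>& keys,
                               std::vector<int>& values,
                               const std::vector<std::size_t>& offsets)
{
  std::vector<int> sorted_keys(keys.size());
  std::vector<int> sorted_values(values.size());
  segmented_sort_pairs(keys, sorted_keys, values, sorted_values, offsets);
  keys   = std::move(sorted_keys);
  values = std::move(sorted_values);
}
```

The cost of this approach is one extra allocation per buffer for the duration of the sort; the move back is cheap because it transfers ownership rather than copying elements.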
alexbarghi-nv left a comment
Approved but should be re-reviewed by @ChuckHastings or @seunghwak when they're back from vacation.
Thanks! I agree -- these should be reviewed more but I don't want to break cuGraph with the CCCL 3.2 update. Thanks for taking a look and discussing offline @alexbarghi-nv and @rlratzel.
Fixes needed for CCCL 3.2 compatibility.